Configurable Transient Fault Detection via Dynamic Binary Translation
Abstract
Smaller feature sizes, lower voltage levels, and reduced noise margins have helped improve the performance and lower the power consumption of modern microprocessors. These same advances have made processors more susceptible to transient faults that can corrupt data and make systems unavailable. Designers often compensate for transient faults by adding hardware redundancy and making circuit- and process-level adjustments. However, applications differ in their data integrity and availability demands, which makes such hardware approaches too costly for many markets. Software techniques can provide fault tolerance at a lower cost and with greater flexibility, since they can be selectively deployed in the field even after the hardware has been manufactured. Most existing software-only techniques rely on recompilation, which requires access to program source code. Regardless of the code transformation method, previous techniques also incur unnecessarily large performance penalties because they protect the entire program uniformly, without taking into account that different program regions and state elements vary in their vulnerability to transient faults. This paper presents Spot, a software-only fault-detection technique that uses dynamic binary translation to provide software-modulated fault tolerance with fine-grained control of redundancy. By using dynamic binary translation, users can improve the reliability of their applications without any assistance from hardware or software vendors. By using software-modulated fault tolerance, Spot can vary the level of protection independently for each register and region of code, giving users more, and often superior, fault-detection options. This feature of Spot increases the mean work to failure from 1.90x to 17.79x.
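The idea of varying protection per register can be illustrated with a toy model. The sketch below is not Spot's implementation and involves no binary translation; it only assumes a simple scheme in which each protected register keeps a shadow copy, redundant arithmetic updates the shadow, and a check compares the two before the value is used. The names `ToyMachine`, `run`, and `FaultDetected` are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field


class FaultDetected(Exception):
    """Raised when a protected register disagrees with its shadow copy."""


@dataclass
class ToyMachine:
    # Registers chosen for protection (a hypothetical per-register policy).
    protected: set
    regs: dict = field(default_factory=dict)
    shadow: dict = field(default_factory=dict)

    def _shadow_value(self, reg):
        # Unprotected registers have no shadow, so fall back to the primary.
        return self.shadow.get(reg, self.regs[reg])

    def set(self, dst, value):
        self.regs[dst] = value
        if dst in self.protected:
            self.shadow[dst] = value

    def add(self, dst, a, b):
        # Compute the shadow first so that dst may alias a source register.
        shadow_result = self._shadow_value(a) + self._shadow_value(b)
        self.regs[dst] = self.regs[a] + self.regs[b]
        if dst in self.protected:
            self.shadow[dst] = shadow_result

    def check(self, reg):
        # Compare primary and shadow before the value leaves the machine.
        if reg in self.protected and self.regs[reg] != self.shadow[reg]:
            raise FaultDetected(reg)
        return self.regs[reg]

    def flip_bit(self, reg, bit):
        # Model a transient fault as a single bit flip in the primary copy.
        self.regs[reg] ^= 1 << bit


def run(protected, fault=None):
    """Compute r3 = r1 + r2, optionally flipping one bit after the loads."""
    m = ToyMachine(protected=set(protected))
    m.set("r1", 5)
    m.set("r2", 7)
    if fault is not None:
        m.flip_bit(*fault)
    m.add("r3", "r1", "r2")
    return m.check("r3")
```

With all three registers protected, a bit flip in `r1` makes `run` raise `FaultDetected`; with no registers protected, the same fault silently yields a wrong sum. The trade-off in this toy is that every protected register costs extra storage and extra arithmetic, which is the overhead that selective protection tries to spend only where it matters.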